Goto

Collaborating Authors

 kurland & lee


The Opposite of Smoothing: A Language Model Approach to Ranking Query-Specific Document Clusters

Journal of Artificial Intelligence Research

Exploiting information induced from (query-specific) clustering of top-retrieved documents has long been proposed as a means for improving precision at the very top ranks of the returned results. We present a novel language model approach to ranking query-specific clusters by the presumed percentage of relevant documents that they contain. While most previous cluster ranking approaches focus on the cluster as a whole, our model utilizes also information induced from documents associated with the cluster. Our model substantially outperforms previous approaches for identifying clusters containing a high relevant-document percentage. Furthermore, using the model to produce document ranking yields precision-at-top-ranks performance that is consistently better than that of the initial ranking upon which clustering is performed. The performance also favorably compares with that of a state-of-the-art pseudo-feedback-based retrieval method.


From ``Identical'' to ``Similar'': Fusing Retrieved Lists Based on Inter-Document Similarities

Journal of Artificial Intelligence Research

Methods for fusing document lists that were retrieved in response to a query often utilize the retrieval scores and/or ranks of documents in the lists. We present a novel fusion approach that is based on using, in addition, information induced from inter-document similarities. Specifically, our methods let similar documents from different lists provide relevance-status support to each other. We use a graph-based method to model relevance-status propagation between documents. The propagation is governed by inter-document-similarities and by retrieval scores of documents in the lists. Empirical evaluation demonstrates the effectiveness of our methods in fusing TREC runs. The performance of our most effective methods transcends that of effective fusion methods that utilize only retrieval scores or ranks.